□ 1. Document ID: US 20040111714 A1

L6: Entry 1 of 7    File: PGPB    Jun 10, 2004

PGPUB-DOCUMENT-NUMBER: 20040111714
PGPUB-FILING-TYPE: new
DOCUMENT-IDENTIFIER: US 20040111714 A1
TITLE: Dynamic division optimization for a just-in-time compiler
PUBLICATION-DATE: June 10, 2004



INVENTOR-INFORMATION:

NAME              CITY      STATE  COUNTRY
Shi, Xiaohua      Beijing          CN
Lueh, Guei-Yuan   San Jose  CA     US
Ying, Zhiwei      Beijing          CN

US-CL-CURRENT: 717/148; 717/127, 717/153 
CLAIMS: 

What is claimed is: 

1. A method for improving the performance of a dynamic compiler, comprising: 
receiving a first code; determining a strategy for optimizing a segment of the 
first code; optimizing the segment of the first code using the determined 
optimization strategy; and outputting a second code, representing the optimized 
first code. 

2. The method of claim 1, wherein the first code represents a computer programming
code in a high-level programming language.

3. The method of claim 2, wherein the high-level programming language comprises a
compiled Java code. 

4. The method of claim 1, wherein determining the strategy comprises dynamically
determining a best strategy available to the compiler to optimize the segment of 
the first code according to characteristics of the segment. 

5. The method of claim 1, wherein optimizing the segment of the first code 
comprises converting the segment in a high-level language to a set of lower-level 
computing instructions native to an underlying computing architecture, selecting a 
best optimization implementation, and optimizing the set of lower-level computing 
instructions based on the determined optimization strategy.

38. The system of claim 33, wherein the divisor profiling mechanism comprises: a
creation component to create a new entry in a divisor cache, if a divisor appears
for the first time; an initialization component to initialize at least a flag field
and an optimization parameter field of the newly created entry for the divisor in
the divisor cache; and a counting component to increment the number of occurrences
of the divisor and record the new number in the divisor counter field of the entry
for the divisor in the divisor cache.

39. The system of claim 38, wherein the divisor profiling mechanism further 
comprises: a component to determine whether a divisor becomes invariant for the 
first time during runtime by comparing the number of occurrences of the divisor 
with a pre-set trigger number; a component to select a best optimization 
implementation for a division code with the divisor; and a component to send 
requests to the optimization preparation mechanism to prepare optimization 
parameters for the divisor, if the divisor becomes invariant for the first time 
during runtime. 

40. The system of claim 39, wherein the divisor profiling mechanism further 
comprises a component to store the prepared optimization parameters in the divisor 
cache for the divisor, and to replace the value of the divisor's flag field with 
the divisor in the divisor cache. 

41. The system of claim 33, wherein the optimization preparation mechanism 
comprises: a component to prepare optimization parameters required by the selected 
optimization implementation; a component to pass optimization parameters to the 
divisor profiling mechanism to update a divisor cache, and to the selected 
optimization implementation to invoke the selected optimization implementation. 
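
As an illustration of the divisor profiling recited in claims 38 to 40, the following Python sketch shows one way a divisor cache might be organized. The names DivisorEntry, DivisorCache, prepare_params, and divide, the trigger value of 3, and the use of a shift for power-of-two divisors are all assumptions made for this example; none of them is taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DivisorEntry:
    count: int = 0                 # divisor counter field
    flag: Optional[int] = None     # flag field; holds the divisor once it is treated as invariant
    shift: Optional[int] = None    # optimization parameter field (hypothetical: a shift amount)


def prepare_params(divisor: int) -> Optional[int]:
    """Hypothetical preparation step: a shift amount for positive powers of two."""
    if divisor > 0 and divisor & (divisor - 1) == 0:
        return divisor.bit_length() - 1
    return None  # this sketch has no optimized implementation for other divisors


class DivisorCache:
    def __init__(self, trigger: int = 3):    # assumed pre-set trigger number
        self.trigger = trigger
        self.entries: Dict[int, DivisorEntry] = {}

    def profile(self, divisor: int) -> DivisorEntry:
        entry = self.entries.get(divisor)
        if entry is None:                    # divisor appears for the first time
            entry = DivisorEntry()           # flag and parameter fields start empty
            self.entries[divisor] = entry
        entry.count += 1                     # record the new number of occurrences
        if entry.flag is None and entry.count >= self.trigger:
            # The divisor is treated as invariant for the first time:
            # prepare parameters, store them, and set the flag to the divisor.
            entry.shift = prepare_params(divisor)
            entry.flag = divisor
        return entry


def divide(cache: DivisorCache, dividend: int, divisor: int) -> int:
    entry = cache.profile(divisor)
    if entry.flag == divisor and entry.shift is not None:
        return dividend >> entry.shift       # floor division by 2**shift
    return dividend // divisor               # unoptimized path
```

In Python both `>>` and `//` round toward negative infinity, so the two paths agree for every integer dividend. A compiler for a language whose integer division truncates toward zero would need an extra adjustment for negative dividends, which this sketch does not model.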



□ 2. Document ID: US 20030105942 A1

L6: Entry 2 of 7    File: PGPB    Jun 5, 2003

PGPUB-DOCUMENT-NUMBER: 20030105942
PGPUB-FILING-TYPE: new
DOCUMENT-IDENTIFIER: US 20030105942 A1
TITLE: Aggressive prefetch of address chains
PUBLICATION-DATE: June 5, 2003

INVENTOR-INFORMATION:

NAME               CITY           STATE  COUNTRY
Damron, Peter C.   Fremont        CA     US
Kosche, Nicolai    San Francisco  CA     US

US-CL-CURRENT: 712/216; 712/207
CLAIMS: 

What is claimed is: 

1. In a scheduler for computer code wherein certain operations are likely to stall
execution of the computer code and thereby provide latency for completion of one or
more pre-executable operations, a method of scheduling certain of the operations,
the method comprising: for one or more sequences of operations that follow a
speculation boundary and that define respective dependency chains, including
pre-executable operations, which lead to likely stalls, representing speculative copies
thereof as duplicate chains; and scheduling operations of the computer code,
wherein the scheduling of operations from the duplicate chains is performed without
regard to dependence of respective original operations on the speculation boundary,
thereby scheduling certain of the operations above the speculation boundary into
position preceding at least one of the operations likely to stall execution of the
computer code.

2. A method, as recited in claim 1, wherein the likely stalls include likely cache 
misses. 

3. A method, as recited in claim 1, wherein the dependency chains include address 
chains leading to memory access operations likely to miss in a cache. 

4. A method, as recited in claim 1, wherein the pre-executable operations include 
prefetch instructions. 

5. A method, as recited in claim 1, wherein the pre-executable operations include 
speculative operations. 

6. A method, as recited in claim 1, wherein the operations likely to stall 
execution include memory access instructions. 

7. A method, as recited in claim 1, wherein the operations likely to stall 
execution include operations selected from the set of: a load operation; first use 
of a load operation; a store operation; a branch operation; a multi-cycle 
computational operation; an iterative or recursive operation; a communications 
operation; an input/output (I/O) operation; a synchronization operation; and a
co-processor operation.

8. A method, as recited in claim 1, wherein the speculation boundary is defined by 
one of: a store operation; a branch operation; a join operation; an iterative or 
recursive operation; a communications operation; an input/output (I/O) operation; a 
synchronization operation; and a co-processor operation. 

9. A method, as recited in claim 1, further comprising: inserting the
pre-executable operations into the computer code.

10. A method, as recited in claim 1, further comprising: profiling the computer 
code to identify the likely stalls. 

11. A method, as recited in claim 1, further comprising: upon reaching the 
speculation boundary, deleting unscheduled operations of the duplicate chains and 
continuing to schedule respective original operations. 

12. A method, as recited in claim 1, further comprising: deleting from the original 
operations, pre-executable operations for which a respective speculative copy is 
scheduled. 

13. A method of hiding latency in computer code wherein certain operations thereof 
are likely to stall execution, the method comprising: identifying sequences of 
operations that define respective original dependency chains that lead to likely 
stalls and for at least some of the identified sequences, representing duplicate

prefetch instruction in the execution sequence.

44. The computer program product of claim 42, further comprising: a martyr
instruction that follows the speculative load instruction and the prefetch 
instruction which, upon execution, provides at least a portion of a latency 
therefor. 

45. The computer program product of claim 42, prepared by a program scheduler that 
inserts prefetch instructions into the execution sequence and schedules speculative 
duplicates of at least some load instructions together with corresponding prefetch 
instructions above speculative boundaries therein. 

46. The computer program product of claim 42, wherein the one or more computer 
readable media are selected from the set of a disk, tape or other magnetic, 
optical, semiconductor or electronic storage medium and a network, wireline, 
wireless or other communications medium. 

47. An apparatus comprising: a code preparation facility for transforming 
schedulable code into scheduled code; and means for scheduling speculative copies 
of operations that form dependency chains that lead to a likely stall, the 
scheduling placing the speculative operations above a preceding at least one other 
operation that is itself likely to stall, thereby hiding in the scheduled code 
latency of the speculative operations. 

48. The apparatus of claim 47, further comprising: means for inserting
pre-executable operations into the schedulable code, wherein at least some of the
pre-executable operations are scheduled by the scheduling means as the speculative
operations for which latency is hidden.

49. The apparatus of claim 47, further comprising: means for identifying
likely-to-stall operations of schedulable code.
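
The scheduling idea in claims 1, 45, and 47 can be sketched in Python under strong simplifications. The sketch assumes a straight-line list of operations, treats stores and branches as the only speculation boundaries, and handles a single stalling operation and a single duplicate chain. The Op record, BOUNDARY_KINDS, hoist_duplicate_chain, the "_spec" renaming, and the EXAMPLE sequence are hypothetical names introduced for illustration and do not come from the application.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Op:
    kind: str                       # "alu", "load", "store", "branch" or "prefetch"
    dst: Optional[str] = None
    srcs: Tuple[str, ...] = ()
    likely_stall: bool = False
    speculative: bool = False


BOUNDARY_KINDS = {"store", "branch"}    # assumed speculation boundaries


def _first(ops, start, pred):
    return next((i for i in range(start, len(ops)) if pred(ops[i])), None)


def hoist_duplicate_chain(ops: List[Op]) -> List[Op]:
    """Copy one address chain from below a boundary to just before an earlier stall."""
    stall = _first(ops, 0, lambda o: o.likely_stall)
    boundary = None if stall is None else _first(
        ops, stall + 1, lambda o: o.kind in BOUNDARY_KINDS)
    target = None if boundary is None else _first(
        ops, boundary + 1, lambda o: o.kind == "load" and o.likely_stall)
    if target is None:
        return list(ops)

    # Walk backwards from the target load, collecting the operations after the
    # boundary that produce its address (the dependency chain).
    needed = set(ops[target].srcs)
    chain: List[Op] = []
    for i in range(target - 1, boundary, -1):
        if ops[i].dst in needed:
            chain.append(ops[i])
            needed.discard(ops[i].dst)
            needed.update(ops[i].srcs)
    chain.reverse()

    # Remaining inputs must already exist before the stall and stay unchanged.
    available = {o.dst for o in ops[:stall] if o.dst}
    clobbered = {o.dst for o in ops[stall:target] if o.dst}
    if not needed <= available or needed & clobbered:
        return list(ops)

    # Build speculative copies with renamed results; the final load becomes a prefetch.
    renamed = {}
    copies: List[Op] = []
    for op in chain:
        srcs = tuple(renamed.get(s, s) for s in op.srcs)
        renamed[op.dst] = op.dst + "_spec"
        copies.append(Op(op.kind, renamed[op.dst], srcs, speculative=True))
    addr = tuple(renamed.get(s, s) for s in ops[target].srcs)
    copies.append(Op("prefetch", None, addr, speculative=True))
    return ops[:stall] + copies + ops[stall:]


EXAMPLE = [
    Op("alu", "base"),
    Op("load", "x", ("base",), likely_stall=True),   # earlier operation likely to stall
    Op("store", None, ("x",)),                       # speculation boundary
    Op("load", "p", ("base",)),                      # address chain
    Op("load", "y", ("p",), likely_stall=True),      # load likely to miss
]
```

Calling hoist_duplicate_chain(EXAMPLE) places a speculative copy of the load of p and a prefetch of its result ahead of the first stalling load, while the original operations stay where they were. Deleting originals whose copies were scheduled, as in claim 12, is not modeled.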



□ 3. Document ID: US 20030101443 A1 

L6: Entry 3 of 7 File: PGPB May 29, 2003 

PGPUB-DOCUMENT-NUMBER: 20030101443
PGPUB-FILING-TYPE: new
DOCUMENT-IDENTIFIER: US 20030101443 A1
TITLE: Technique for associating execution characteristics with instructions or
operations of program code
PUBLICATION-DATE: May 29, 2003

INVENTOR-INFORMATION:

NAME                   CITY           STATE  COUNTRY
Kosche, Nicolai        San Francisco  CA     US
Aoki, Christopher P.   Los Altos      CA     US
Damron, Peter C.       Fremont        CA     US

US-CL-CURRENT: 717/158



CLAIMS: 

What is claimed is: 

1. A code preparation method comprising: identifying at least one operation in 
first executable instance of code; executing the first executable instance and 
responsive to detection of an execution event, associating a corresponding 
execution characteristic with a corresponding identified one of the operations; and
preparing a second executable instance of the code based, at least in part, on the 
association between the execution characteristic and the identified operation. 

2. The method of claim 1, wherein the operation identification is consistent 
between the first executable instance and the preparation of the second executable
instance.

3. The method of claim 2, wherein the consistency of operation identification is 
maintained from preparation of the first executable instance to preparation of the 
second executable instance. 

4. The method of claim 1, wherein unique identification numbers are assigned
to corresponding operations of the first executable and the second executable. 

5. The method of claim 4, wherein the execution characteristic is associated with 
the unique identification number. 

6. The method of claim 4, wherein the unique identification numbers and their 
assignment to operations are maintained throughout any optimizations or code 
transformations performed in preparation of the first executable. 

7. The method of claim 6, wherein the maintenance of the unique identification 
number assignments includes further assigning the unique identification number to a
copy when an operation is copied as part of a code transformation or optimization. 

8. The method of claim 6, wherein the maintenance of the unique identification 
number assignments includes removing an assignment when the assigned operation is 
removed as part of a code transformation or optimization. 

9. The method of claim 1, wherein the associating of the corresponding execution
characteristic includes encoding aggregated hardware event information in an
extended definition of an instruction instance for use in the preparation of the
second executable instance.

10. The method of claim 1, wherein the identified operation is a memory access 
instruction. 

11. The method of claim 1, wherein the execution characteristic includes a cache 
miss likelihood. 

12. The method of claim 1, wherein the preparation includes inserting one or more 
prefetch operations in the code prior to the identified operation to exploit 
latency provided by servicing of a cache miss by the identified operation. 

13. The method of claim 1, further comprising: preparing the first executable
instance. 
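
The bookkeeping described in claims 4 to 8, 11, and 12 above can be sketched in Python as follows. The sketch assumes operations are plain records, that a profile has already been reduced to a cache miss likelihood per identification number, and that a fixed threshold decides where prefetches go. The names Op, assign_ids, duplicate_body, remove_op, insert_prefetches, and the 0.5 threshold are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Op:
    text: str
    op_id: Optional[int] = None    # unique identification number, if assigned


def assign_ids(ops: List[Op]) -> List[Op]:
    """Give every operation of the first executable a unique ID."""
    ids = count(1)
    return [replace(op, op_id=next(ids)) for op in ops]


def duplicate_body(ops: List[Op]) -> List[Op]:
    """Stand-in for a transformation that copies operations; each copy keeps its ID."""
    return list(ops) + [replace(op) for op in ops]


def remove_op(ops: List[Op], op_id: int) -> List[Op]:
    """Stand-in for a transformation that removes an operation along with its ID."""
    return [op for op in ops if op.op_id != op_id]


def insert_prefetches(ops: List[Op], miss_likelihood: Dict[int, float],
                      threshold: float = 0.5) -> List[Op]:
    """Prepare the second instance: place a prefetch ahead of each likely-missing op."""
    out: List[Op] = []
    for op in ops:
        if op.op_id is not None and miss_likelihood.get(op.op_id, 0.0) >= threshold:
            out.append(Op("prefetch"))   # placeholder; what to prefetch is not modeled
        out.append(op)
    return out
```

Because a copied operation keeps the ID of its original, a likelihood recorded once against that ID applies to every copy when the second instance is prepared.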

14. The method of claim 13, wherein the preparation of the first executable

L6: Entry 4 of 7    File: PGPB    May 16, 2002

PGPUB-DOCUMENT-NUMBER: 20020059568
PGPUB-FILING-TYPE: new
DOCUMENT-IDENTIFIER: US 20020059568 A1
TITLE: Program compilation and optimization
PUBLICATION-DATE: May 16, 2002



INVENTOR-INFORMATION:

NAME                 CITY            STATE  COUNTRY
Kawahito, Motohiro   Sagamihara-shi         JP
Ogasawara, Takeshi   Tokyo-to               JP
Komatsu, Hideaki     Yokohama-shi           JP

US-CL-CURRENT: 717/151
CLAIMS:

What is claimed is:



1. A compiler for converting source code for a program written in a programming 
language into object code in a machine language, comprising: an optimization 
execution unit for performing an optimization process for an object program written 
in a machine language; and a program modification unit for modifying said object 
program in order to absorb a difference in content between the point of origin of 
an exception process, which occurs in response to the execution of a command in 
said object program, and a location whereat said exception process is performed. 



2. The compiler according to claim 1, wherein, if there is a difference in content
between the point of origin of an exception process, which occurs in response to
the execution of a command in said object program, and a location whereat said
exception process is performed, said program modification unit generates
compensation code to compensate for said difference, and inserts said compensation
code into said object program.

3. The compiler according to claim 1, wherein said program modification unit
includes: a pre-processor for, before said optimization execution unit performs 
said optimization process, employing a Try node to examine a command that may cause 
an exception process in said object program to determine whether an exception 
process has occurred, and a Catch node for performing an inherent process when it 
is found an exception process has occurred; and a post-processor for examining, in 
said object program that has been optimized by said optimization execution unit, 
said command that may cause an exception process to determine whether a difference 
in content exists between said command that may cause said exception process and a 
location whereat said exception process is performed, and for, when a difference 
exists, generating in said Catch node a compensation code, to be used to compensate 
for said difference, and a code for, after said compensation code is obtained, 
moving program control to said location whereat said exception process is 
performed.



4. The compiler according to claim 1, wherein, before said optimization execution 
unit performs said optimization process in said object program, said program 
modification unit divides said command that may cause an exception process into a

has occurred, and that includes a command for, when an exception process has
occurred, moving program control to a portion whereat said exception process is 
performed; and a process for, when a difference in content exists between the point 
of origin of said exception process and said portion whereat said exception process 
is performed, generating in said basic block compensation code for compensating for
said difference. 

16. A program transmission apparatus comprising: storage means for storing a 
program that permits a computer to perform a process for preparing a basic block 
that includes a portion for examining a command, in an object program, that may 
cause an exception process, in order to determine whether an exception process has 
occurred, and that includes a command for, when an exception process has occurred, 
moving program control to a portion whereat said exception process is performed, 
and a process for, when a difference in content exists between the point of origin
of said exception process and said portion whereat said exception process is
performed, generating in said basic block compensation code for compensating for
said difference; and transmission means for reading said program from said storage 
means and for transmitting said program. 
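
The role of compensation code in claims 2, 3, and 16 can be illustrated with a small Python sketch. It assumes that the optimization is the removal of a store that the normal path later overwrites, and that the exception handler reads the stored value. The functions original, optimized, and handler, and the dictionary standing in for memory, are hypothetical; the application describes the technique for compiled object code, not for Python.

```python
def handler(mem):
    """Location where the exception process is performed; it observes mem['a']."""
    return ("handled", mem["a"])


def original(mem, arr, i):
    try:
        mem["a"] = 1          # store executed before the command that may raise
        x = arr[i]            # command that may cause an exception
        mem["a"] = x
    except IndexError:
        return handler(mem)
    return mem["a"]


def optimized(mem, arr, i):
    # The first store was removed from the normal path because it is overwritten.
    try:
        x = arr[i]            # command that may cause an exception
    except IndexError:
        mem["a"] = 1          # compensation code: restore the state the handler expects
        return handler(mem)   # then move control to the exception process
    mem["a"] = x
    return mem["a"]
```

Both versions return the same result and leave the same contents in mem whether or not the access raises, which is the property the compensation code is meant to preserve.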




□ 5. Document ID: US 5404555 A

L6: Entry 5 of 7 File: USPT Apr 4, 1995 

US-PAT-NO: 5404555 

DOCUMENT-IDENTIFIER: US 5404555 A 

TITLE: Macro instruction set computer architecture 

DATE-ISSUED: April 4, 1995 

INVENTOR-INFORMATION:

NAME        CITY     STATE  ZIP CODE  COUNTRY
Liu; Dali   Beijing                   CN

US-CL-CURRENT: 712/36; 712/202
CLAIMS:

What is claimed is:

1. A macro-instruction set computer architecture comprising:

main memory means for storing system softwares of the computer, instructions
and user programs;

first memory means for storing preparatory data for operation, intermediate
results of operation and final results of completed operation, and operating
in the form of a stack;

second memory means for storing a break point address of subprograms and
address for recovery of a break point while returning from call, and operating
in the form of a stack; and